V0.9.2 log trace #26283

wanx7130 · 2025-10-06T08:28:16Z

Purpose

Test Plan

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing the test command.
The test results, such as pasting a before/after comparison of the results, or e2e results.
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

…#20412)" This reverts commit e202dd2.

github-actions · 2025-10-06T08:28:26Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

mergify · 2025-10-06T08:29:06Z

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @wanx7130.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

gemini-code-assist

Code Review

This pull request introduces significant changes to enable v0 execution on CPU, XPU, and TPU backends, including new worker implementations, model runners, and attention backends. It also includes refactoring of existing backends for better consistency and adds some debug logging. My review found a critical issue in the new CPU MLA backend implementation that would likely cause runtime errors.

gemini-code-assist · 2025-10-06T08:30:27Z

vllm/attention/backends/cpu_mla.py

+            slot_mapping=slot_mapping,
+            multi_modal_placeholder_index_maps=placeholder_index_maps,
+            enable_kv_scales_calculation=False,
+            input_positions=torch.tensor([self.input_data.input_positions]))


The input_positions tensor is being created as a 2D tensor by wrapping self.input_data.input_positions in an extra list. The rotary embedding layer expects a 1D tensor for positions. This will likely lead to an indexing error or incorrect behavior at runtime.

Suggested change

-            input_positions=torch.tensor([self.input_data.input_positions]))
+            input_positions=torch.tensor(self.input_data.input_positions))
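The shape difference behind this suggestion can be illustrated without installing torch. The sketch below is not vLLM code: `nested_shape` is a hypothetical helper that mimics how a tensor's shape is inferred from a nested Python list, and it assumes `self.input_data.input_positions` is a flat list of integers (the values shown are made up). Under that assumption, wrapping the list in another list adds a leading dimension of size 1, which is the 2D tensor the review warns about.

```python
from typing import Any, Tuple


def nested_shape(value: Any) -> Tuple[int, ...]:
    """Return the shape implied by a rectangular nested list of numbers."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(shape)


# Hypothetical stand-in for self.input_data.input_positions (assumed flat).
input_positions = [0, 1, 2, 3]

wrapped = nested_shape([input_positions])  # extra list, as in the diff
flat = nested_shape(input_positions)       # as in the suggested change

print(wrapped)  # (1, 4): two dimensions
print(flat)     # (4,): one dimension
```

With torch installed, `torch.tensor([[0, 1, 2, 3]]).shape` and `torch.tensor([0, 1, 2, 3]).shape` show the same two shapes.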

mergify · 2025-10-06T08:31:38Z

⚠️ The sha of the head commit of this PR conflicts with #26284. Mergify cannot evaluate rules on this PR. ⚠️

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

Open a pull request for review.
Mark a draft as ready.
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

chatgpt-codex-connector · 2025-10-06T08:32:12Z

vllm/worker/cpu_model_runner.py

+        if self.runner.attn_backend is not None:
+            # spec decode (e.g. Medusa) does not have atten backend
+            attn_backend = self.runner.attn_backend
+            self.att_metadata_builder = attn_backend.get_builder_cls()(self)
+
+    def prepare(self,
+                finished_requests_ids: Optional[List[str]] = None) -> None:
+        self.seq_group_metadata_list: List[SequenceGroupMetadata] = []
+        self.input_data = ModelInputForCPUBuilder.ModelInputData(
+            self.runner.model_config.uses_mrope)
+        self.att_metadata_builder.prepare()


Guard missing when no attention backend is present

In ModelInputForCPUBuilder.prepare() the code unconditionally calls self.att_metadata_builder.prepare(), but self.att_metadata_builder is only created when self.runner.attn_backend is not None (see the comment about speculative decoding having no attention backend). When the runner is used for configurations that do not build an attention backend, prepare() will raise an AttributeError before any inputs are built, breaking CPU speculative decoding. Add a null check before invoking the builder or skip metadata preparation when no attention backend exists.
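One way to apply the guard described above is sketched here. This is a simplified illustration rather than the actual vLLM classes: `FakeMetadataBuilder` and `InputBuilderSketch` are hypothetical stand-ins, and the `attn_backend_available` flag replaces the real `self.runner.attn_backend is not None` check. The idea is to always define the builder attribute (defaulting to `None`) and to skip metadata preparation when it is absent.

```python
from typing import List, Optional


class FakeMetadataBuilder:
    """Hypothetical stand-in for an attention metadata builder."""

    def __init__(self) -> None:
        self.prepared = False

    def prepare(self) -> None:
        self.prepared = True


class InputBuilderSketch:
    """Hypothetical, simplified stand-in for ModelInputForCPUBuilder."""

    def __init__(self, attn_backend_available: bool) -> None:
        # Always define the attribute so prepare() can test it safely.
        self.att_metadata_builder: Optional[FakeMetadataBuilder] = None
        if attn_backend_available:
            self.att_metadata_builder = FakeMetadataBuilder()

    def prepare(self,
                finished_requests_ids: Optional[List[str]] = None) -> None:
        self.seq_group_metadata_list: list = []
        # Skip metadata preparation when no attention backend exists.
        if self.att_metadata_builder is not None:
            self.att_metadata_builder.prepare()


with_backend = InputBuilderSketch(attn_backend_available=True)
with_backend.prepare()

without_backend = InputBuilderSketch(attn_backend_available=False)
without_backend.prepare()  # completes without raising AttributeError
```

Initializing the attribute to `None` in the constructor keeps the check in `prepare()` a plain `is not None` test instead of a `hasattr` lookup.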


simon-mo and others added 2 commits July 6, 2025 14:02

Revert "[V0 deprecation] Remove V0 CPU/XPU/TPU backends (vllm-project…

a5dd03c

…#20412)" This reverts commit e202dd2.

add log for request trace

bdc22d5

wanx7130 requested review from NickLucche, jikunshang, bigPYJ1151, mgoin, tlrmchlsmth, WoosukKwon, yewentao256, robertgshaw2-redhat, njhill, ywang96, comaniac, alexm-redhat, heheda12345, ApostaC and LucasWilkinson as code owners October 6, 2025 08:28

mergify bot added the documentation and ci/build labels Oct 6, 2025

wanx7130 closed this Oct 6, 2025

mergify bot added the v1 and tpu labels Oct 6, 2025

mergify bot added the needs-rebase and kv-connector labels Oct 6, 2025

gemini-code-assist bot reviewed Oct 6, 2025

chatgpt-codex-connector bot reviewed Oct 6, 2025
